Genes coding for tomato beta-galactosidase polypeptides

ABSTRACT

A novel β-galactosidase gene family and DNA sequences derived from the cloning of cDNAs encoding products of these genes are provided, as exemplified by a β-galactosidase II protein which is encoded by a cDNA clone, pZBG2-1-4. A method for modifying cell wall metabolism which involves modifying the activity of at least one β-galactosidase, and thus modifying the quality of the fruit is also provided. Also provided by the present invention is a DNA construct including some or all of a β-galactosidase DNA sequence under control of a transcriptional initiation region operative in plants, so that the construct can generate RNA and, optionally, β-galactosidase polypeptide in plant cells. The present invention also relates to recombinant vectors, which include the isolated nucleic acid molecules of the present invention, and to host cells containing the recombinant vectors, as well as to methods of making such vectors and host cells and for using them for production of β-galactosidase polypeptides or peptides by recombinant techniques. The present invention also provides plant cells containing DNA constructs of the present invention; plants derived therefrom having modified β-galactosidase gene expression; and seeds produced from such plants.

FIELD OF THE INVENTION

The present invention relates to a family of novel plant genes encoding polypeptides characterized by their ability to hydrolyze terminal non-reducing β-D-galactosyl residues from β-D-galactosides. More specifically, a polynucleotide sequence derived from a cDNA clone designated pZBG2-1-4 (referred to in U.S. Provisional Application No. 60/088,805 as pTomβgal 4), which encodes a specific plant polypeptide named β-galactosidase II, is provided. Also provided are cDNA clones encoding six other homologous polypeptides, methods of using these cDNA clones for producing β-galactosidase polypeptides of the invention, and methods of modifying fruit quality by employment of a polynucleotide or polypeptide of the present invention.

BACKGROUND OF THE INVENTION

The most conspicuous and important processes related to post-harvest quality of climacteric fruit are the changes in texture, color, taste, and aroma which occur during ripening. Because of the critical relationship that deleterious changes in texture have to quality and post-harvest shelf-life, emphasis has been placed on studying the mechanisms involved in the loss of firmness that occurs during tomato fruit ripening. Although fruit softening may involve changes in turgor pressure, anatomical characteristics and cell wall integrity, it is generally assumed that cell wall disassembly leading to a loss of wall integrity is a critical feature. The most apparent changes, in terms of composition and size, occur in the pectic fraction of the cell wall (see references in Seymour and Gross, 1996. Postharvest News Info. 7: 45N-52N).

Changes known to occur in the pectic fraction of the cell wall during fruit ripening include increased solubility, depolymerization, de-esterification and a significant net loss of neutral sugar containing side chains (Huber, D. J. 1983. Hort. Rev. 5: 169-219; Fischer and Bennett. 1991. Annu. Rev. Plant Physiol. Plant Mol. Bio. 42: 675-703; Seymour and Gross, supra). The best characterized pectin-modifying enzymes are polygalacturonase (endo-α1→4-D-galacturonan hydrolase; E.C. 3.2.1.15; PG) and pectin methylesterase (E.C. 3.1.1.11; PME). Although PG and PME are relatively abundant and have substantial activity during tomato fruit ripening, softening still occurs, albeit with a slight delay, in fruit where PG (Smith et al. 1988. Nature 334: 724-726) or PME (Tieman et al. 1992. Plant Cell 4: 667-679; Hall et al. 1993. Plant J. 3: 121-129) gene expression and enzyme activity were significantly down-regulated in transgenic plants. Moreover, over-expression of PG in non-ripening mutant rin tomato fruit did not result in softening even though depolymerization and solubilization of pectin were evident (Giovannoni et al. 1989. Plant Cell 1: 53-63).

Among the other known pectin modifications that occur during fruit development, one of the best characterized is the significant net loss of galactosyl residues which occurs in the cell walls of many ripening fruit (Gross and Sams. 1984. Phytochem. 23: 2457-2461; Seymour and Gross, supra). Although some loss of galactosyl residues could result indirectly from the action of PG, β-galactosidase (exo-β(1→4)-D-galactopyranoside; E.C. 3.2.1.23) is the only enzyme identified in higher plants capable of directly cleaving β(1→4) galactan bonds, and probably plays a role in galactan sidechain loss (DeVeau et al. 1993. Physiol. Plant 87: 279-285; Carey et al. 1995. Plant Physiol. 108: 1099-1107; Carrington and Pressey, 1996. J. Amer. Soc. Hort. Sci. 121: 132-136). No endo-acting galactanase has yet been identified in higher plants. The view that β-galactosidase is active in releasing galactosyl residues from the cell wall during ripening is supported by the dramatic increase in free galactose, a product of β-galactosidase activity (Gross, K. C. 1984. Physiol. Plant 62: 25-32) and a concomitant increase in activity of a particular enzyme, designated β-galactosidase II, in tomatoes during ripening (Carey et al., supra). β-galactosidase activity is thought to be important in cell wall metabolism (Carey et al., supra). β-galactosidases are generally assayed using artificial substrates such as p-nitrophenyl-β-D-galactopyranoside (PNP), 4-methylumbelliferyl-β-D-galactopyranoside and 5-bromo-4-chloro-3-indoxyl-β-D-galactopyranoside (X-GAL). However, it is clear that β-galactosidase II is also active against natural substrates, i.e., β(1→4) galactan (Carey et al., supra; Carrington and Pressey, supra; Pressey, R. 1983. Plant Physiol. 71: 132-135). β-galactosidase proteins have been purified and characterized in a number of other fruits including kiwifruit (Ross et al. 1993. Planta 189: 499-506), coffee (Golden et al. 1993. Phytochem. 34: 355-360), persimmon (Kang et al. 1994. Plant Physiol. 105: 975-979), and apple (Ross et al. 1994. Plant Physiol. 106: 521-528).

Carey et al. (supra) were able to purify the three previously identified β-galactosidases from ripening tomato fruit (Pressey, supra), but only one (β-galactosidase II) was active against β(1→4) galactan. Even though they were able to identify putative β-galactosidase cDNA clones, none of the cDNAs' deduced amino acid sequences matched the amino terminal sequence of the β-galactosidase II protein. Although β-galactosidase II, a protein present in tomato (Lycopersicon esculentum Mill.) fruit during ripening and capable of degrading tomato fruit galactan, has been purified, cloning of the corresponding gene has been elusive.

The modification of plant gene expression has been achieved by several methods. The molecular biologist can choose from a range of known methods to decrease or increase gene expression or to alter the spatial or temporal expression of a particular gene. For example, the expression of either specific antisense RNA or partial (truncated) sense RNA has been utilized to reduce the expression of various target genes in plants (as reviewed by Bird and Ray. 1991. Biotechnology and Genetic-Engineering Reviews 9: 207-227). These techniques involve the incorporation into the genome of the plant of a synthetic gene designed to express either antisense or sense RNA. They have been successfully used to down-regulate the expression of a range of individual genes involved in the development and ripening of tomato fruit (Gray et al. 1992. Plant Molecular Biol. 9: 69-87). Methods to increase the expression of a target gene have also been developed. For example, additional genes designed to express RNA containing the complete coding region of the target gene may be incorporated into the genome of the plant to “over-express” the gene product. Various other methods to modify gene expression are known; for example, the use of alternative regulatory sequences. The complete disclosure of each of the references cited above is fully incorporated herein by reference.

A need therefore exists to clone a gene for β-galactosidase II and related polypeptides and, using known methods of modifying plant gene expression, to provide methods for modifying the quality of fruits, particularly by modifying the cell wall and thereby directly affecting the ripening of the fruit.

SUMMARY OF THE INVENTION

The present invention is based on the discovery of novel DNA sequences derived from cDNA clones from a family of genes encoding β-galactosidases. The phylogenic tree based on the shared amino acid sequence identities for the DNA sequences of the present invention is shown in FIG. 1A,B. Five cDNA and two RT-PCR clones, designated herein as TBG1, TBG2, TBG3, TBG4, TBG5, TBG6, and TBG7 and having the nucleic acid sequences designated SEQ ID NOs 1-7, respectively as shown in FIG. 2, were identified which had a high degree of shared sequence identity to other known β-galactosidases. The corresponding amino acid sequences are designated herein as SEQ ID NOs 8-16, respectively and are shown in FIGS. 2 and 3.

The nucleotide sequences for SEQ ID NOs 1-7 are recorded in GenBank with the following respective Accession Numbers:

-   SEQ ID NO:1 TBG1 AF023847 deposited Sep. 10, 1997
-   SEQ ID NO:2 TBG2 AF154420 deposited May 19, 1999
-   SEQ ID NO:3 TBG3 AF154421 deposited May 20, 1999
-   SEQ ID NO:4 TBG4 AF020390 deposited Aug. 21, 1997
-   SEQ ID NO:5 TBG5 AF154423 deposited May 20, 1999
-   SEQ ID NO:6 TBG6 AF154424 deposited May 20, 1999
-   SEQ ID NO:7 TBG7 AF154422 deposited May 20, 1999

Throughout the following discussion, wherever TBG4 is indicated in the description of the invention, it is to be understood that TBG1-3 and 5-7 are also to be included in that description, unless otherwise indicated.

A method of providing a DNA sequence of the invention, either by cloning a cDNA (for instance, pZBG2-1-4) that codes for a protein of the present invention, such as β-galactosidase II, or by deriving the DNA sequence from genomic DNA, or by synthesis of a DNA sequence ab initio using the cDNA sequence as a guide is also provided.

A method for modifying cell wall metabolism, which involves modifying the activity of at least one β-galactosidase and thus modifying the quality of the fruit, is also provided.

Also provided by the present invention is a DNA construct including some or all of an exemplary β-galactosidase DNA sequence under control of a transcriptional initiation region operative in plants, so that the construct can generate RNA in plant cells.

Also discovered is a promoter gene associated with expression of the genes encoding β-galactosidase.

The present invention also relates to recombinant vectors, which include the isolated nucleic acid molecules of the present invention, and to host cells containing the recombinant vectors, as well as to methods of making such vectors and host cells and for using them for production of β-galactosidase polypeptides or peptides by recombinant techniques.

The present invention also provides plant cells containing DNA constructs of the present invention; plants derived therefrom having modified β-galactosidase gene expression; and seeds produced from such plants.

The β-galactosidase II protein of the present invention has demonstrated enzyme activity in cell wall disassembly leading to loss of tissue integrity and fruit softening. The β-galactosidase II protein also may be involved in cell turnover, which could be involved in cell extension and/or expansion and therefore plant growth and development.

By hydrolyzing galactose from the cell wall, the enzyme may allow ripening to commence and/or progress, since galactose may be involved in stimulating ethylene production alone or in conjunction with unconjugated N-glycans.

The β-galactosidase of the invention may be involved in conversion of chloroplasts (green—chlorophyll) to chromoplasts (red—lycopene) during fruit ripening by degrading chloroplast membrane galactolipids.

The family of genes represented by the nucleotide sequences shown in FIG. 2 is expected to code for a group of similar enzymes with the same type of hydrolytic activity but with different tissue and/or substrate specificities or cellular compartmentation profiles.

The β-galactosidase II protein of the present invention, as well as other proteins encoded in the nucleotide sequences shown in FIG. 2, may be used for preparation of pectin and other cell wall derived polymers with lowered galactosyl content for use in biofilms and solutions (for example in clarification of fruit juices) requiring lower or higher cross-linking or viscometric properties.

The present invention also provides β-galactosidase enzymes for use as components of enzyme mixtures for protoplast isolation.

BRIEF DESCRIPTION OF THE FIGURES

FIGS. 1A and 1B show a phylogenic tree based on shared amino acid sequence identity among tomato β-galactosidase clones TBG1-7 and other known plant β-galactosidase polypeptides.

FIG. 2 shows cDNA sequences [SEQ ID NOs: 1-7, respectively] for the seven β-galactosidase genes of the invention: TBG1, TBG2, TBG3, TBG4, TBG5, TBG6, TBG7.

FIG. 3 shows multiple sequence alignment of the N-terminal amino acid sequences of β-galactosidase II protein from tomato fruit for TBG1, TBG2, TBG3, TBG4, TBG5, TBG6 and TBG7 [SEQ ID NOs: 8-16, respectively] and various plant β-galactosidase cDNA clones.

FIG. 4 shows autoradiograph of northern blot analysis of TBG expression in various plant tissues (flowers, leaves, roots and stems).

FIG. 5 shows autoradiograph of northern blot analysis of TBG expression in fruit tissues at different stages of development.

FIG. 6 shows autoradiograph of northern blot analysis of TBG expression in fruit tissues (mature green or turning stage fruit peel, outer pericarp, inner pericarp and locular).

FIG. 7 shows autoradiograph of northern blot analysis of TBG expression in normal and mutant fruit tissues.

FIG. 8 shows autoradiograph of northern blot analysis of TBG expression in response to ethylene treatment of mature green fruit tissues.

FIG. 9 shows Western blot analysis of TBG4 expression by yeast.

FIG. 10 shows detection of β-galactosidase activity from pZBG2-1-4 expression in E. coli.

FIGS. 11A-E show the comparative results of texture measurements for fruit from tomato plants containing antisense constructs to suppress TBG4 mRNA and fruit from the parental line.

FIGS. 12A-B show Northern blot analysis of TBG4 expression in transgenic fruit containing TBG4 antisense construct.

FIG. 13 shows a binary construct used to transform plants and express TBG4 (pZBG2-1-4) in the antisense orientation.

DETAILED DESCRIPTION

The following detailed description is directed to a preferred embodiment of the present invention and is intended as illustrative of each of the other DNA sequences of the present invention.

The present invention provides isolated nucleic acid molecules comprising a polynucleotide encoding β-galactosidase polypeptides, particularly a β-galactosidase II polypeptide having the amino acid sequence shown in FIG. 2. The DNA sequence of the exemplary β-galactosidase II cDNA clone of the invention, which was determined from a cDNA clone, pZBG2-1-4, encoding β-galactosidase II, is recorded in GenBank as Accession Number AF020390. Not all β-galactosidases possess in vitro activity against extracted cell wall material via the release of galactose from wall polymers containing β(1→4)-D-galactan. The polypeptide expressed from the exemplary β-galactosidase II clone, pZBG2-1-4, has been shown to exhibit β-galactosidase activity and exo-galactanase activity.

The exemplary β-galactosidase II protein of the present invention, as shown in FIG. 2, shares sequence homology with the amino acid sequence deduced from β-galactosidase cDNA clones of TBG2-7 and cDNA clones of the fruits of asparagus (accession number P45582), apple (accession number P48981), and carnation (accession number Q00662), as well as with β-galactosidase cDNA clones of a previously published sequence of a tomato β-galactosidase cDNA clone designated pTombgal 1 (accession number P48980) isolated from ripe ‘Ailsa Craig’ fruit (Carey et al., supra). The ORF of the clone TBG1 disclosed herein by the inventors (accession number AF023847) is nearly identical to the cDNA previously described by Carey et al. As shown in FIG. 2, the shared deduced sequence identity is high among all the published plant β-galactosidases of the seven clones (TBG1-7) and the other plant β-galactosidases.

BLAST searches of the database also indicated significant shared sequence identity between domains of the plant β-galactosidases and mammalian and fungal β-galactosidases; however, little shared sequence identity was detected with bacterial β-galactosidases.

As shown in FIG. 1, the shared amino acid identity of TBG1 and TBG3 was high. TBG4 was also very similar to both TBG1 and 3. The amino acid sequences of TBG2 and 7 were unique because several regions of amino acid insertions appear throughout their sequence (FIG. 3).

Nucleic Acid Molecules

Unless otherwise indicated, all nucleotide sequences determined by sequencing a DNA molecule herein were determined using a PCR-based dideoxynucleotide terminator protocol and an ABI automated DNA sequencer (such as the Model 373 from Applied Biosystems, Inc., Foster City, Calif.), and all amino acid sequences of polypeptides encoded by DNA molecules determined herein were predicted by translation of a DNA sequence determined as above. Therefore, as is known in the art for any DNA sequence determined by this automated approach, any nucleotide sequence determined herein may contain some errors. Nucleotide sequences determined by automation are typically at least about 90% identical, more typically at least about 95% to at least about 99.9% identical to the actual nucleotide sequence of the sequenced DNA molecule. The actual sequence can be more precisely determined by other approaches including manual DNA sequencing methods well known in the art. As is also known in the art, a single insertion or deletion in a determined nucleotide sequence compared to the actual sequence will cause a frame shift in translation of the nucleotide sequence such that the predicted amino acid sequence encoded by a determined nucleotide sequence will be completely different from the amino acid sequence actually encoded by the sequenced DNA molecule, beginning at the point of such an insertion or deletion.
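By way of illustration only, the effect of a single inserted nucleotide on a predicted translation may be sketched as follows. The short sequences and the function name `translate` are hypothetical and are not taken from the sequences of the invention; the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str) -> str:
    """Translate complete codons starting at the first base; '*' marks a stop."""
    usable = len(dna) - len(dna) % 3          # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# Hypothetical "actual" sequence and a "determined" sequence carrying one
# spurious base inserted after the second codon.
actual = "ATGGCTTCTGTTAAA"
determined = actual[:6] + "C" + actual[6:]

print(translate(actual))      # MASVK
print(translate(determined))  # MALC*  (identical only up to the insertion)
```

In this toy case the two predicted peptides agree for the first two residues and diverge from the point of insertion onward, which is the behavior described above.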

By “nucleotide sequence” of a nucleic acid molecule or polynucleotide is intended, for a DNA molecule or polynucleotide, a sequence of deoxyribonucleotides, and for an RNA molecule or polynucleotide, the corresponding sequence of ribonucleotides (A, G, C and U), where each thymidine deoxyribonucleotide (T) in the specified deoxyribonucleotide sequence is replaced by the ribonucleotide uridine (U).
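The correspondence between a deoxyribonucleotide sequence and its ribonucleotide counterpart may be expressed, as a minimal sketch, by a single substitution. The function name `dna_to_rna` and the example sequence are hypothetical.

```python
def dna_to_rna(dna: str) -> str:
    """Return the corresponding RNA sequence: each T is replaced by U."""
    return dna.upper().replace("T", "U")

print(dna_to_rna("ATGGCTTCT"))  # AUGGCUUCU
```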

Using the information provided herein, such as the exemplary nucleotide sequence shown in FIG. 2 [SEQ ID NO: 4], a nucleic acid molecule of the present invention encoding a β-galactosidase II polypeptide may be obtained using standard cloning and screening procedures, such as those for cloning cDNAs using mRNA as starting material. Illustrative of the invention, the nucleic acid molecule described in FIG. 2 [SEQ ID NO: 4] was discovered in a cDNA library derived from breaker, turning and pink fruit pericarp from ‘Rutgers’ tomato plants. A second library comprised poly(A)RNA isolated from all fruit tissues (except seeds) from immature green, mature green, breaker, turning, pink, red-ripe and over-ripe fruit of ‘Rutgers’ plants.

The complete sequence of the cDNA insert of pZBG2-1-4 is accessible in GenBank (Accession No. AF020390) and is provided in FIG. 2 [SEQ ID NO: 4]. The cDNA insert is 2532 nucleotides (nt) long and contains a single, long open reading frame (ORF) predicted to start with the first in-frame ATG at nt 64 and end with TAA at nt 2238. This ORF codes for a 79 kD protein 724 amino acids long. The deduced amino acid sequence of pZBG2-1-4 shared significant amino acid identity to all published plant β-galactosidase sequences in the database (FIGS. 1A,B). When the entire ORF of each β-galactosidase gene was compared to pZBG2-1-4, the shared sequence identity was about 64% for tomato pTombgal 1 (P48980), about 67.6% for apple (P48981), about 63% for asparagus (P45582) and about 55% for carnation (Q00662). As one of ordinary skill would appreciate, due to the possibilities of sequencing errors discussed above, the actual complete β-galactosidase II polypeptide encoded by the deposited cDNA, which comprises about 724 amino acids, may be somewhat longer or shorter. More generally, the actual open reading frame may be anywhere in the range of ±20 amino acids, more likely in the range of ±10 amino acids, of that predicted from the first methionine codon from the N-terminus shown in FIG. 2 [SEQ ID NO: 4]. In any event, as discussed further below, the invention further provides polypeptides having various residues deleted from the N-terminus of the complete polypeptide, including polypeptides lacking one or more amino acids from the N-terminus of the β-galactosidase II polypeptide described herein.
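The coordinates given above can be checked against one another with simple arithmetic. The sketch below assumes that nt 64 is the first base of the initiating ATG and that nt 2238 is the last base of the TAA stop codon (1-based numbering); the function and key names are hypothetical.

```python
ORF_START = 64        # first base of the initiating ATG, from the text
STOP_END = 2238       # assumed to be the last base of the TAA stop codon
INSERT_LENGTH = 2532  # length of the cDNA insert in nucleotides, from the text

def orf_summary(start: int, stop_end: int, insert_length: int) -> dict:
    """Derive codon and untranslated-region counts from 1-based coordinates."""
    span = stop_end - start + 1           # nucleotides from ATG through TAA
    if span % 3:
        raise ValueError("ORF span is not a whole number of codons")
    return {
        "codons_including_stop": span // 3,
        "amino_acids": span // 3 - 1,     # the stop codon encodes no residue
        "five_prime_utr_nt": start - 1,
        "three_prime_utr_nt": insert_length - stop_end,
    }

summary = orf_summary(ORF_START, STOP_END, INSERT_LENGTH)
print(summary["amino_acids"])  # 724
```

Under these assumptions the span of 2175 nt corresponds to 725 codons, that is, 724 amino acids followed by the stop codon, in agreement with the stated length of the predicted protein.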

Leader and Mature Sequences

Analysis of the deduced amino acid sequence of pZBG2-1-4 suggested a high probability for secretion based on the presence of a hydrophobic leader sequence, a leader sequence cleavage site and three possible N-glycosylation sites. The programs PSORT V6.4 (Nakai and Kanehisa. 1992. Genomics 14: 897-911, incorporated herein by reference) and SignalP V1.1 (Nielsen et al. 1997. Protein Engineering 10: 1-6, incorporated herein by reference) were used to predict that the ORF contains a hydrophobic leader sequence that would be cleaved between the alanine and serine residues at positions 23 and 24 respectively, and that the mature polypeptide has an extracellular location. The mature polypeptide contains three possible N-glycosylation sites at asparagine numbers 282, 459 and 713; however, the asparagine at position 713 is unlikely to be glycosylated due to the proline at position 714. The predicted molecular mass of the unglycosylated mature polypeptide was 75 kD with a pI of 8.9.
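A scan for possible N-glycosylation sites of the kind discussed above may be sketched as follows. The sketch assumes the commonly used Asn-X-Ser/Thr motif and flags a motif as unlikely when X is proline; the peptide shown is a hypothetical toy sequence, not the β-galactosidase II sequence, and the function name `find_sequons` is likewise hypothetical.

```python
import re

def find_sequons(protein: str):
    """Return (position, tripeptide, likely) for each N-X-S/T motif.

    Positions are 1-based. 'likely' is False when the residue following
    the asparagine is proline.
    """
    hits = []
    # A lookahead is used so that overlapping motifs are all reported.
    for match in re.finditer(r"(?=(N[A-Z][ST]))", protein):
        tripeptide = match.group(1)
        hits.append((match.start() + 1, tripeptide, tripeptide[1] != "P"))
    return hits

toy_peptide = "MKNGSAANPTLLK"   # hypothetical example
for position, motif, likely in find_sequons(toy_peptide):
    print(position, motif, "possible" if likely else "unlikely (Pro at +1)")
```

For the toy peptide, the motif at position 3 is reported as possible and the motif at position 8 as unlikely, mirroring the treatment of the asparagine at position 713.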

Accordingly, the amino acid sequence of the complete β-galactosidase II protein of the invention includes a leader sequence and a mature protein, as shown in FIG. 3 [SEQ ID NO: 4]. More particularly, the present invention provides nucleic acid molecules encoding a mature form of the β-galactosidase II protein. Thus, according to the signal hypothesis, secreted proteins have a signal or secretory leader sequence which is cleaved from the complete polypeptide to produce a secreted “mature” form of the protein. In some cases, cleavage of a secreted protein is not entirely uniform, which results in two or more mature species of the protein. Further, it has long been known that the cleavage specificity of a secreted protein is ultimately determined by the primary structure of the complete protein, that is, it is inherent in the amino acid sequence of the polypeptide. Therefore, the present invention provides a nucleotide sequence encoding the mature β-galactosidase II polypeptide having the amino acid sequence encoded by the cDNA shown in FIG. 2 [SEQ ID NO: 4] and provided in GenBank (Accession No. AF020390). By the “mature β-galactosidase II polypeptide having the amino acid sequence encoded by the cDNA clone shown in FIG. 2 [SEQ ID NO: 4]” is meant the mature form(s) of the β-galactosidase II protein produced by expression in a plant cell of the complete open reading frame encoded by the cDNA sequence of the clone shown in FIG. 2 [SEQ ID NO: 4] and provided in GenBank (Accession No. AF020390).

The exemplary β-galactosidase II cDNA of the present invention (TBG4) has been expressed in E. coli strain XL1-Blue MR (lacZ) (Stratagene, La Jolla, Calif.), as described hereinbelow (see Example).

Analysis of the deduced amino acid sequences of cDNA clones representing the other β-galactosidase genes of the invention also revealed open reading frames and, in some cases, suggested a high probability for secretion of the encoded proteins. All the full-length cDNA clones were predicted to have a signal sequence (FIG. 2). TBG4 was predicted to be secreted by both prediction programs, SignalP and PSORT. TBG1, 2 and 3 were predicted to have cleavable signal sequences by SignalP, but uncleavable signal sequences by PSORT. TBG7 was suggested to be targeted to the chloroplast by PSORT. Particular observations for the clones other than TBG4, which is described above, are as follows, based on the presence of a hydrophobic leader predicted by the programs PSORT V6.4 and SignalP V1.1:

-   TBG1: initiation codon at 306 [SEQ ID NO: 1], ORF=835 amino acids [SEQ ID NO: 8], signal sequence at 1-24;
-   TBG2: initiation codon not determined [SEQ ID NO: 2], ORF=888 amino acids [SEQ ID NO: 9], signal sequence at 1-25;
-   TBG3: initiation codon at 32 [SEQ ID NO: 3], ORF=838 amino acids [SEQ ID NO: 10], signal sequence at 1-22;
-   TBG5: initiation codon not determined [SEQ ID NO: 5], ORF=251 amino acids [SEQ ID NO: 12], signal sequence not determined;
-   TBG6: initiation codon not determined [SEQ ID NO: 6], ORF=248 amino acids [SEQ ID NO: 13], signal sequence not determined;
-   TBG7: initiation codon at 104 [SEQ ID NO: 7], ORF=870 amino acids [SEQ ID NO: 14], signal sequence at 1-35.

The deduced amino acid sequences of the seven clones were also subjected to analysis using the program DNAsis, and the predictions for molecular mass, cellular targeting, pI and potential N-linked glycosylation sites are summarized in Table I.

TABLE I

Tomato β-galactosidase (TBG) cDNA sequence data. Five full-length and two partial-length cDNAs were cloned and sequenced. The DNA and deduced amino acid sequence data are presented below.

  CLONE   mRNA (kb)   kD     pI    N-LINK^(a)   TARGET
  ------- ----------- ------ ----- ------------ ------------
  TBG1    3.2         90.8   6.2   2            ER/OUT^(b)
  TBG2    3.0         97.0   6.2   6            PM^(c)
  TBG3    2.8         90.5   8.2   1            ER/OUT
  TBG4    2.6         77.9   8.9   3            OUT
  TBG5    ~3
  TBG6    ~3
  TBG7    3.0         93.3   8.0   6            CHLOR^(d)

^(a)Possible N-linked glycosylation sites
^(b)Endoplasmic Reticulum; OUT = Secreted
^(c)Tethered to Plasma Membrane
^(d)Chloroplast

As indicated, nucleic acid molecules of the present invention may be in the form of RNA, such as mRNA, or in the form of DNA, including, for instance, cDNA and genomic DNA obtained by cloning or produced synthetically. The DNA may be double-stranded or single-stranded. Single-stranded DNA or RNA may be the coding strand, also known as the sense strand, or it may be the non-coding strand, also referred to as the anti-sense strand.

By “isolated” nucleic acid molecule(s) is intended a nucleic acid molecule, DNA or RNA, which has been removed from its native environment. For example, recombinant DNA molecules contained in a vector are considered isolated for the purposes of the present invention. Further examples of isolated DNA molecules include recombinant DNA molecules maintained in heterologous host cells or purified (partially or substantially) DNA molecules in solution. Isolated RNA molecules include in vivo or in vitro RNA transcripts of the DNA molecules of the present invention. Isolated nucleic acid molecules according to the present invention further include such molecules produced synthetically.

Isolated nucleic acid molecules of the present invention include DNA molecules comprising an open reading frame (ORF) with an initiation codon at position 64 of the nucleotide sequence shown in FIG. 2 [SEQ ID NO: 4]. Also included are DNA molecules comprising the coding sequence for the mature β-galactosidase II protein shown at positions 135-2532 of FIG. 2 [SEQ ID NO: 4].

In addition, isolated nucleic acid molecules of the invention include DNA molecules which comprise a sequence substantially different from those described above but which, due to the degeneracy of the genetic code, still encode the β-galactosidase II protein. Of course, the genetic code and species-specific codon preferences are well known in the art. Thus, it would be routine for one skilled in the art to generate the degenerate variants described above, for instance, to optimize codon expression for a particular host (e.g., change codons in the plant mRNA to those preferred by a bacterial host such as E. coli). Preferably, this nucleic acid molecule will encode the mature polypeptide encoded by the above-described deposited cDNA clone.
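The degeneracy referred to above may be illustrated, as a sketch only, by two different DNA sequences that encode the same short peptide. The sequences, the peptide and the function names are hypothetical; the standard genetic code is assumed.

```python
from itertools import product
from math import prod

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str) -> str:
    """Translate complete codons starting at the first base."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the given peptide."""
    codons_per_residue = {
        aa: sum(1 for a in CODON_TABLE.values() if a == aa) for aa in set(peptide)
    }
    return prod(codons_per_residue[aa] for aa in peptide)

seq_a = "ATGGCTTCTGTT"   # hypothetical coding sequence
seq_b = "ATGGCAAGCGTC"   # different codons, same encoded peptide

print(translate(seq_a), translate(seq_b))  # MASV MASV
print(count_encodings("MASV"))             # 96
```

Which of the synonymous codons is preferred for a given host depends on that host's codon usage, which is not modeled in this sketch.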

The invention further provides an isolated nucleic acid molecule having the nucleotide sequence shown in FIG. 2 [SEQ ID NO: 4] or a nucleic acid molecule having a sequence complementary to the above sequence. Such isolated molecules, particularly DNA molecules, are useful as probes for gene mapping, by in situ hybridization with chromosomes, and for detecting expression of the β-galactosidase II gene in plant tissue, for instance, by Northern blot analysis.
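A sequence complementary to a given DNA sequence, written in the conventional 5′ to 3′ direction, may be derived as sketched below. The function name `reverse_complement` and the example sequence are hypothetical.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(dna: str) -> str:
    """Return the complementary strand read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]

print(reverse_complement("ATGGCTTCT"))  # AGAAGCCAT
```

Applying the function twice returns the original sequence, which provides a simple check on a probe design.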

The present invention is further directed to nucleic acid molecules encoding portions of the nucleotide sequences described herein as well as to fragments of the isolated nucleic acid molecules described herein. In particular, the invention provides a polynucleotide having a nucleotide sequence representing the portion of FIG. 2 [SEQ ID NO: 4] which consists of positions 1-2532 of FIG. 2 [SEQ ID NO: 4].

In addition, the invention provides additional nucleic acid molecules having nucleotide sequences related to extensive portions of FIG. 2 [SEQ ID NO: 4] which have been determined from the following related cDNA clones: TBG1-3 and TBG5-7 as shown in FIG. 3, SEQ ID NOs: 1-3 and 5-7.

In another aspect, the invention provides an isolated nucleic acid molecule comprising a polynucleotide which hybridizes under stringent hybridization conditions to a portion of the polynucleotide in a nucleic acid molecule of the invention described above, for instance, the cDNA clone shown in FIG. 2 [SEQ ID NO: 4]. By “stringent hybridization conditions” is intended overnight incubation at 42° C. in a solution comprising: 50% formamide, 5×SSC (1×SSC being 150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5×Denhardt's solution, 10% dextran sulfate, and 20 μg/ml denatured, sheared salmon sperm DNA, followed by washing the filters in 0.1×SSC at about 65° C.
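The salt concentrations implied by the SSC strengths recited above may be computed as sketched below, assuming the conventional definition of 1×SSC as 150 mM NaCl and 15 mM trisodium citrate. The function name `ssc_concentrations` is hypothetical.

```python
# Conventional composition of 1x SSC, in millimolar.
NACL_MM_1X = 150.0
CITRATE_MM_1X = 15.0

def ssc_concentrations(fold: float) -> dict:
    """Return NaCl and trisodium citrate concentrations (mM) for fold-x SSC."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {
        "NaCl_mM": round(fold * NACL_MM_1X, 3),
        "trisodium_citrate_mM": round(fold * CITRATE_MM_1X, 3),
    }

hybridization = ssc_concentrations(5)    # 5x SSC used for hybridization
wash = ssc_concentrations(0.1)           # 0.1x SSC used for the stringent wash
print(hybridization, wash)
```

On this basis the hybridization solution contains 750 mM NaCl and 75 mM trisodium citrate, and the wash contains 15 mM NaCl and 1.5 mM trisodium citrate; the fifty-fold reduction in salt, together with the higher temperature, is what makes the wash stringent.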

As indicated, nucleic acid molecules of the present invention which encode β-galactosidase II polypeptide may include, but are not limited to those encoding the amino acid sequence of the mature polypeptide, by itself; and the coding sequence for the mature polypeptide and additional sequences, such as those encoding the about 1-23 amino acid leader sequence, such as a pre-, or pro- or prepro- protein sequence; the coding sequence of the mature polypeptide, with or without the aforementioned additional coding sequences.

Also discovered is an enhancer/promoter associated with expression of the genes encoding β-galactosidase. The inventors have characterized the expression profile of TBG2 mRNA and have cloned a lambda genomic cDNA. TBG2 is expressed before the onset of fruit ripening and continues at a uniform level throughout all the ripening stages. TBG2 has been found to be expressed in all fruit tissues and has also been found to be fruit specific. Experiments have shown TBG2 to be unaffected by ethylene. TBG2 is expressed in the ripening mutants rin, nor and Nr at the normal chronological time after anthesis. The promoter discovered would be useful to express any gene in the sense or antisense orientation, specifically in tomato fruit, in all tomato fruit tissues, starting before and continuing throughout the entire ripening process. The promoter could also be used to express any gene in the ripening mutants rin, nor and Nr without the need to gas the fruit with exogenous ethylene.

Variant and Mutant Polynucleotides

The present invention further relates to variants of the nucleic acid molecules of the present invention, which encode portions, analogs or derivatives of the β-galactosidase II protein. Variants may occur naturally, such as a natural allelic variant. By an “allelic variant” is intended one of several alternate forms of a gene occupying a given locus on a chromosome of an organism. (Genes II, 1985, Lewin, B., ed., John Wiley & Sons, New York). Non-naturally occurring variants may be produced using art-known mutagenesis techniques.

Such variants include those produced by nucleotide substitutions, deletions or additions. The substitutions, deletions or additions may involve one or more nucleotides. The variants may be altered in coding regions, non-coding regions, or both. Alterations in the coding regions may produce conservative or non-conservative amino acid substitutions, deletions or additions. Especially preferred among these are silent substitutions, additions and deletions, which do not alter the properties and activities of the β-galactosidase II protein or portions thereof. Also especially preferred in this regard are conservative substitutions.
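The distinction drawn above between silent substitutions and substitutions that change the encoded amino acid can be illustrated with a short sketch. It assumes the standard genetic code and uses hypothetical helper names; it only tests whether a single-codon change alters the encoded residue and makes no attempt to judge whether a non-silent change is conservative.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_codon(codon):
    """Translate one DNA or RNA codon; '*' denotes a stop codon."""
    return CODON_TABLE[codon.upper().replace("U", "T")]


def classify_substitution(codon_before, codon_after):
    """Return 'silent' if both codons encode the same residue, else 'non-silent'."""
    same = translate_codon(codon_before) == translate_codon(codon_after)
    return "silent" if same else "non-silent"


if __name__ == "__main__":
    print(classify_substitution("GGT", "GGC"))  # glycine to glycine
    print(classify_substitution("GAT", "GAA"))  # aspartate to glutamate
```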

Most highly preferred are nucleic acid molecules encoding the mature protein having the amino acid sequence shown in FIG. 2 as pZBG2-1-4 or the mature β-galactosidase II amino acid sequence encoded by the deposited cDNA clone.

Further embodiments include an isolated nucleic acid molecule comprising a polynucleotide having a nucleotide sequence at least 90% identical, and more preferably at least 95%, 96%, 97%, 98% or 99% identical to a polynucleotide selected from the group consisting of: (a) a nucleotide sequence encoding the β-galactosidase II polypeptide having the complete amino acid sequence in FIG. 2 [SEQ ID NO: 4]; (b) a nucleotide sequence encoding the mature β-galactosidase II polypeptide shown in FIG. 2 [SEQ ID NO: 4]; and (c) a nucleotide sequence complementary to any of the nucleotide sequences in (a) or (b) above.
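By way of illustration only, a percent-identity threshold of the kind recited above could be checked as sketched below. This passage does not state which alignment program or parameters define identity, so the sketch assumes two sequences that have already been aligned to equal length and simply counts identical columns, treating gapped columns as mismatches. All names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns carrying the same residue in both sequences.

    Assumes the inputs are pre-aligned to equal length. Columns in which
    either sequence has a gap are counted as mismatches in this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a, aligned_b, threshold=90.0):
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy sequences: 28 matching positions out of 30.
    reference = "A" * 30
    variant = "A" * 28 + "CC"
    print(round(percent_identity(reference, variant), 1))
    print(meets_threshold(reference, variant, 90.0))
```

A different convention (for example, excluding gapped columns from the denominator) would give different values, so the counting rule should be fixed before any threshold is applied.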

Vectors and Host Cells

The present invention also relates to vectors which include the isolated DNA molecules of the present invention, host cells which are genetically engineered with the recombinant vectors, and the production of β-galactosidase II polypeptides or fragments thereof by recombinant techniques. The vector may be, for example, a phage, plasmid, viral or retroviral vector. Retroviral vectors may be replication competent or replication defective. In the latter case, viral propagation generally will occur only in complementing host cells.

The polynucleotides may be joined to a vector containing a selectable marker for propagation in a host. Generally, a plasmid vector is introduced in a precipitate, such as a calcium phosphate precipitate, or in a complex with a charged lipid. If the vector is a virus, it may be packaged in vitro using an appropriate packaging cell line and then transduced into host cells.

The DNA insert should be operatively linked to an appropriate promoter, such as the phage lambda PL promoter, the E. coli lac, trp, phoA and tac promoters, the SV40 early and late promoters and promoters of retroviral LTRs, to name a few. Other suitable promoters will be known to the skilled artisan. The expression constructs will further contain sites for transcription initiation, termination and, in the transcribed region, a ribosome binding site for translation. The coding portion of the transcripts expressed by the constructs will preferably include a translation initiating codon at the beginning and a termination codon (UAA, UGA or UAG) appropriately positioned at the end of the polypeptide to be translated.
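The preference stated above, that the coding portion begin with a translation initiating codon and end with an appropriately positioned termination codon (UAA, UGA or UAG), can be checked mechanically. The following is a minimal sketch with hypothetical names; it assumes AUG as the initiating codon, which the text does not specify, and accepts either DNA or RNA letters.

```python
STOP_CODONS = {"UAA", "UGA", "UAG"}  # termination codons named in the text
START_CODON = "AUG"                  # assumption: the usual initiating codon


def check_coding_region(seq):
    """Report basic start/stop features of a coding sequence read in frame 1."""
    rna = seq.upper().replace("T", "U")
    in_frame = len(rna) % 3 == 0
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [rna[i:i + 3] for i in range(0, len(rna) - len(rna) % 3, 3)]
    return {
        "length_multiple_of_3": in_frame,
        "starts_with_AUG": bool(codons) and codons[0] == START_CODON,
        "ends_with_stop": in_frame and bool(codons) and codons[-1] in STOP_CODONS,
        # Zero-based codon positions of premature stops before the last codon.
        "internal_stops": [
            i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS
        ],
    }


if __name__ == "__main__":
    print(check_coding_region("ATGGCTTAA"))
```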

As indicated, the expression vectors will preferably include at least one selectable marker. Such markers include dihydrofolate reductase, G418 or neomycin resistance for eukaryotic cell culture and tetracycline, kanamycin or ampicillin resistance genes for culturing in E. coli and other bacteria. Representative examples of appropriate hosts include, but are not limited to, bacterial cells, such as E. coli, Streptomyces and Salmonella typhimurium cells; fungal cells, such as yeast cells; insect cells such as Drosophila S2 and Spodoptera Sf9 cells; animal cells such as CHO, COS, 293 and Bowes melanoma cells; and plant cells. Appropriate culture mediums and conditions for the above-described host cells are known in the art.

Vectors preferred for use in bacteria include pQE70, pQE60 and pQE-9, available from QIAGEN, Inc., supra; pBS vectors, Phagescript vectors, Bluescript vectors, pNH8A, pNH16a, pNH18A, pNH46A, available from Stratagene; and ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 available from Pharmacia. Among preferred eukaryotic vectors are pWLNEO, pSV2CAT, pOG44, pXT1 and pSG available from Stratagene; and pSVK3, pBPV, pMSG and pSVL available from Pharmacia. Other suitable vectors will be readily apparent to the skilled artisan.

Introduction of the construct into the host cell can be effected by calcium phosphate transfection, DEAE-dextran mediated transfection, cationic lipid-mediated transfection, electroporation, transduction, infection or other methods. Such methods are described in many standard laboratory manuals, such as Davis et al., Basic Methods In Molecular Biology (1986).

EXAMPLES

Example 1 RNA Extraction

Tomato (Lycopersicon esculentum Mill., cv. ‘Rutgers’) plants were grown in a greenhouse using standard cultural practices. The ripening mutants, ripening inhibitor (rin), non-ripening (nor) and never ripe (Nr) (Tigchelaar et al. 1978. HortScience 13: 508-513), were all in the ‘Rutgers’ background. Flowers were tagged at anthesis and fruit were harvested according to the number of days post-anthesis (dpa) or based on their surface color using ripeness stages as previously described (Mitcham et al. 1989. Plant Physiol. 89: 477-481), the complete disclosure of which is hereby fully incorporated herein by reference. For gene expression studies, a variety of leaf, flower, and stem tissues were harvested from greenhouse-grown plants and roots were harvested from seedlings grown in basal tissue culture medium for 4 weeks after seed germination.

Fruits were processed immediately after harvest in the greenhouse by chilling on ice, excising the various tissues and freezing them in liquid nitrogen. Tissue samples were ground using a mortar and pestle and stored at −80° C. RNA was extracted using the method described in Verwoerd et al. (1989. Nucl. Acids Res. 17: 2362). Poly(A) RNA was purified from total RNA using oligo(dT) columns (Pharmacia, Piscataway, N.J.). RNA was quantified by measuring A₂₆₀ using a dual beam spectrophotometer.

Example 2 RT-PCR

Degenerate primers were designed based on the regions of highest deduced amino acid sequence identity shared among β-galactosidase cDNA clones from apple (accession number P48980), asparagus (P45582) and carnation (Q00662). The two primers used for the first reaction were BG5′E1 (WSNGGNWSNATHCAYTAYCC) and BG3′E (CCRTAYTCRTCNADNGGNGG). A second reaction was done on the products of the first reaction using BG5′I1 (ATHCARACNTAYGTNTTYTGG) and BG3′E. The degeneracy code for the primer sequences is N=a+t+c+g; H=a+t+c; B=t+c+g; D=a+t+g; V=a+c+g; R=a+g; Y=c+t; M=a+c; K=t+g; S=c+g; and W=a+t. The 5′ and 3′ primers corresponded to amino acids 72-78 and 321-315 of the apple clone, respectively. Amplification was done using AmpliTaq DNA polymerase (Perkin Elmer, Norwalk, Conn.) and standard PCR conditions (Ausubel et al. 1987. In: Current Protocols in Molecular Biology, John Wiley and Sons, New York, N.Y.), using the cDNA made for the first cDNA library described below as a template. PCR products were separated in an agarose gel and fragments of the expected size (approximately 750 bp) were purified, cloned into pCRscript (Stratagene, La Jolla, Calif.), and sequenced.
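The degeneracy code given above determines how many distinct oligonucleotides each primer pool contains, and the apple amino acid coordinates account for the expected product size. The sketch below works through both calculations. The primer sequences and the degeneracy code are taken from the text; the function names are illustrative, and the size estimate assumes that the outer primers span apple residues 72 through 321 inclusive with no insertions or deletions in the tomato sequences.

```python
from math import prod

# Degeneracy code as listed in the text (standard IUPAC ambiguity symbols).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ATCG", "H": "ATC", "B": "TCG", "D": "ATG", "V": "ACG",
    "R": "AG", "Y": "CT", "M": "AC", "K": "TG", "S": "CG", "W": "AT",
}

# Primer sequences as given in Example 2.
PRIMERS = {
    "BG5'E1": "WSNGGNWSNATHCAYTAYCC",
    "BG3'E": "CCRTAYTCRTCNADNGGNGG",
    "BG5'I1": "ATHCARACNTAYGTNTTYTGG",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def span_bp(first_aa, last_aa):
    """Nucleotides spanned by an inclusive range of amino acid positions."""
    return (abs(last_aa - first_aa) + 1) * 3


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: length {len(seq)} nt, degeneracy {degeneracy(seq)}")
    print("Outer primer span (apple residues 72-321):", span_bp(72, 321), "bp")
```

Under these assumptions the outer primer pair spans 250 codons, or 750 bp, which agrees with the approximately 750 bp fragments that were purified.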

Example 3 cDNA Library Construction

Two cDNA libraries were constructed. The first comprised poly(A) RNA isolated from breaker, turning and pink fruit pericarp from ‘Rutgers’ plants. The cDNA synthesis and library construction were done exactly according to the manufacturer's instructions for the ZAP-cDNA Gigapack II Gold Cloning Kit (Stratagene), the complete disclosure of which is fully incorporated herein by reference. First-strand cDNA synthesis was primed using a poly(dT) primer and inserts were directionally cloned into the Uni-Zap XR vector using EcoRI and XhoI restriction sites. The second library comprised poly(A) RNA isolated from all fruit tissues (except seeds) from immature green, mature green, breaker, turning, pink, red-ripe and over-ripe fruit of ‘Rutgers’ plants. The cDNA synthesis and library construction were done exactly according to the manufacturer's instructions for the SuperScript Lambda System for cDNA Synthesis and Cloning (Gibco BRL, Gaithersburg, Md.). First-strand cDNA synthesis was primed using an oligo(dT) primer and cDNA inserts were directionally cloned into the ZipLox cloning vector using SalI and NotI restriction sites. Both libraries were amplified and maintained using the host strains provided by the manufacturer, according to their instructions.

One of the clones (RT-PCR2-1) was used to screen 10⁶ plaques from the tomato fruit cDNA libraries at low stringency (hybridization at 45° C., no formamide and final wash with 0.2×SSC at 42° C.). Thirty positive cDNA clones were identified and partially sequenced. Complete sequencing and characterization of the RT-PCR and cDNA clones revealed the possibility of seven unique β-galactosidase genes.

Example 4 DNA and RNA Gel Blot Analysis

Southern analysis was done using the 3′ UTR of each full length clone and the RT-PCR clones as probes against restriction enzyme digested genomic DNA. The genes corresponding to the clones appeared to be present as single copies (data not shown). The same probes were used to map 6 of the 7 genes using RFLPs of recombinant inbred lines and the loci names and map positions are shown in Table II (James Giovannoni, Texas A&M University, personal communication).

TABLE II
TBG loci map positions. Genes were mapped by Southern analysis using RFLPs of recombinant inbred lines.

Gene   Chromosome   Map Position
TBG1   12*          Overlap of IL 12-2, IL 12-3
TBG2   9            IL 9-3
TBG3   3            IL 3-5
TBG4   12*          Overlap of IL 12-2, IL 12-3
TBG5   11           IL 11-3
TBG6   2            Overlap of IL 2-4, IL 2-5
TBG7   no RFLP

*TBG1 and TBG4 are loosely linked.

Total RNA (20 μg/lane) was separated in a formaldehyde/MOPS agarose gel, transferred to Hybond-N⁺ membrane (Amersham, Arlington Heights, Ill.), fixed by incubating for 2 h at 80° C., hybridized overnight in a hybridization incubator (Robbins Scientific, Sunnyvale, Calif.) using a buffer described by Church and Gilbert (1984. Proc. Natl. Acad. Sci. USA 81: 1991-1995), washed to a final stringency of 0.1×SSC with 0.2% SDS at 65° C., and autoradiographed essentially as described by Ausubel et al. (supra). An RNA ladder standard (GibcoBRL) was used to estimate the length of the RNAs. Probes were synthesized using a random priming kit with ³²P-dATP as the label (Boehringer Mannheim, Indianapolis, Ind.). Northern analysis was done using the 3′ UTR of each full length clone and the RT-PCR clones as templates for probe synthesis. As a loading control, RNA blots were stripped and re-probed at a reduced hybridization and washing stringency using a soybean 26S rDNA fragment (Turano et al. 1997. Plant Physiol. 113: 1329-1341). For all hybridizations, ³²P-dATP-labeled probe was diluted to 1-2×10⁶ dpm/mL. DNA gel blot analysis was done essentially as described in Smith and Fedoroff (1995. Plant Cell 7: 735-745) except that 3 μg of genomic DNA was used for each digest. The complete disclosures of the above references are fully incorporated herein by reference.

Example 5 Sequence Analysis

Sequencing was done at the Iowa State University Sequencing Facility (Ames, Iowa) using a PCR-based dideoxynucleotide terminator protocol and an ABI automated sequencer (Applied Biosystems, Foster City, Calif.). The sequencing of both cDNA insert strands was done by primer walking. Nucleotide and deduced amino acid sequence comparisons against the databases were done using BLAST searches (Altschul et al. 1990. J. Mol. Biol. 215: 403-410). Sequence data were analyzed and aligned using DNA Strider 1.2 (Marck, C. 1988. Nucl. Acids Res. 16: 1829-1836) and MacDNAsis (Hitachi, San Bruno, Calif.) software. The complete disclosures of the above references are fully incorporated herein by reference.

Example 6 Northern Blot Analysis

Northern blot analysis was done to reveal which, if any, of the β-galactosidase genes had a fruit-specific expression pattern. With the exception of TBG2, transcripts of all clones were detected in non-fruit tissues (FIG. 4). Transcripts of TBG 1, 4, 5 and 6 were detected in all the tissues tested. TBG3 transcript was detected at low levels in root and stem tissues, while TBG7 transcript was detected in flower and stem tissues.

Example 7 Temporal Expression Pattern in Fruit

The temporal expression pattern of the seven genes in fruit tissue was examined using RNA extracted from all fruit tissues except seeds. Transcripts for all seven genes were detected during some stage of fruit development (FIG. 5). TBG1 and 3 had similar expression patterns and their transcripts were detected throughout the breaker to over-ripe stages. TBG2 had a unique expression pattern and its transcript was detected at a constant level from 30 dpp to the over-ripe stage. The TBG4 expression pattern was similar to that of TBG1 and 3, but differed in that the transcript level was significantly higher at the turning stage. TBG5 had a similar expression pattern to TBG4 during the ripening stages of development; however, TBG5 transcript was also detected throughout all the earlier stages of fruit development. TBG6 had an interesting expression pattern: its transcript was detected, at high levels, only in the pre-ripening stages tested. TBG7 also had a unique expression pattern and its transcript was detected at very low levels throughout all the stages tested, and at moderate levels at 10 dpp and the over-ripe stage.

Example 8 Spatial Expression Pattern in Fruit

Northern blot analysis was also done to determine transcript accumulation in various fruit tissues. Since there were temporal differences in the expression patterns of the TBG genes, both the mature green and turning fruit stages were used for RNA extractions (FIG. 6). Both TBG2 and TBG6 transcripts were detected in all mature green fruit tissues tested. TBG7 transcript was present in all fruit tissues tested except for locules. Both TBG1 and TBG4 transcripts were detected in RNA samples extracted from all turning stage fruit tissues. TBG4 transcript was notably more abundant in the peel. TBG3 and TBG5 expression patterns were unique and their transcripts were detected in all tissues except the outer pericarp and locular tissue, respectively.

Example 9 Expression in Normal Versus Mutant Fruit

In order to better understand the potential roles of the TBG products and transcriptional regulatory mechanisms, northern analysis was performed using fruit tissue from the ripening mutants rin, nor and N^(r). This analysis was important because it might give clues for preliminary determination of any potential ripening and/or softening role any of the TBGs might possess.

The results of mutant fruit Northern analysis suggested that the transcriptional regulation of TBG1, 2, 3, 5 and 7 was unaffected in mutant fruit tissue and that their transcripts were present in a normal chronological (dpp) pattern (FIG. 7). The abundance of TBG4 and 6 transcripts was, however, different in the mutant fruit. TBG4 transcript was not detected in fruit tissue of N^(r) and was detected at much lower levels in rin and nor than in wild type fruit tissues. Normally TBG6 transcripts are detectable at high levels throughout the early stages of fruit development but are not detectable after the mature green stage (40-42 dpp). TBG6 transcripts persisted even to 50 dpp in fruit of all three mutants.

Example 10 Transcriptional Regulation by Ethylene

The northern analysis done using mutant and wild type fruit suggested that TBG4 expression might be up-regulated by ethylene and that TBG6 expression might be down-regulated by ethylene. In order to evaluate this hypothesis mature green fruit were harvested and subjected to a continuous flow of 10 ppm ethylene mixed in air. Control and ethylene-treated fruit were used for RNA extractions at 1, 2, 12 and 24 hours. The results of this experiment confirmed the findings from the mutant fruit northern analysis. As expected, the presence and abundance of TBG1, 2, 3, 5 and 7 transcripts was essentially unaffected in mature green tissues subjected to exogenous ethylene treatment (FIG. 8). However, TBG4 transcript abundance was increased in mature green tissues in the presence of ethylene. From the data presented it was unclear whether TBG6 transcript abundance was reduced by exogenous ethylene treatment since its transcript level was normally reduced at this stage of fruit development.

Example 11 Enzyme Activity

In order to determine the role of the TBG encoded products we initiated experiments to express the cDNA encoded enzymes using heterologous expression systems. Several E. coli expression systems were tested, but the yield of product was very low due to toxicity (see Example 12). Therefore we used a yeast expression system which secretes a mature amino-terminal-FLAG fusion protein into the culture medium. The TBG4 cDNA was tested first and resulted in the production of approximately 1 mg of active TBG4 protein per 50 ml of culture. TBG4 was used first because the cDNA codes for the enzyme β-galactosidase II, which was purified from tomato fruit and has been characterized in some detail (Carey et al., supra; Smith et al. 1998. Plant Physiol. 117: 417-423). Therefore we could compare the activity of the heterologous system-expressed protein to the native enzyme purified from tomato. The TBG4 protein was successfully affinity purified using an anti-FLAG affinity resin (FIG. 9).

The affinity-purified TBG4 enzyme was shown to have β(1→4)-D-galactosidase activity by virtue of its ability to hydrolyze the synthetic substrate p-nitrophenyl-β-D-galactopyranoside (Smith et al., supra). The enzyme can cleave galactosyl residues from a variety of cell wall substrates and therefore has exo-galactanase activity (Table III). The remaining full-length cDNA clones are currently being tested for successful expression of active enzyme. Preliminary results have shown that TBG1 codes for an enzyme which also has both β-D-galactosidase and exo-galactanase activity (Table III).

TABLE III
Cell wall degrading activity of TBG4 and TBG1 expressed in yeast. Removal of galactosyl residues from chelator soluble (CSP) and alkali soluble (ASP) pectin and hemicellulosic (HCF) cell wall fractions purified from tomato fruit.

                           μg galactose released
Enzyme     Substrate^(c)   Boiled   Live
TBG4^(a)   CSP             0        5
           ASP             0        14.5
           HCF             0        4
TBG1^(b)   ASP             0        1.2

^(a)0.005 units enzyme/rx
^(b)0.0005 units enzyme/rx
^(c)2 mg substrate; 4 hr 37° C.
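Because the TBG4 and TBG1 reactions in Table III used different amounts of enzyme (0.005 and 0.0005 units per reaction, respectively), a per-unit comparison may be helpful. The sketch below divides the galactose released by live enzyme by the units used. It assumes that release scales linearly with enzyme amount under the stated conditions, which the specification does not establish, so the results are a rough normalization only. The names are hypothetical.

```python
# Values transcribed from Table III (live enzyme; boiled controls released 0 ug).
RELEASED_UG = {
    ("TBG4", "CSP"): 5.0,
    ("TBG4", "ASP"): 14.5,
    ("TBG4", "HCF"): 4.0,
    ("TBG1", "ASP"): 1.2,
}

# Units of enzyme per reaction, from the table footnotes.
UNITS_PER_REACTION = {"TBG4": 0.005, "TBG1": 0.0005}


def released_per_unit(enzyme, substrate):
    """Micrograms of galactose released per unit of enzyme in the 4 hr assay."""
    return RELEASED_UG[(enzyme, substrate)] / UNITS_PER_REACTION[enzyme]


if __name__ == "__main__":
    for (enzyme, substrate) in RELEASED_UG:
        value = released_per_unit(enzyme, substrate)
        print(f"{enzyme} on {substrate}: {value:.0f} ug galactose per unit")
```

On the shared ASP substrate this normalization gives values of the same order for the two enzymes, although the single-point data do not support a firmer conclusion.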

Example 12 pZBG2-1-4 Codes for a β-Galactosidase

The TBG4 ORF was cloned in-frame into the repressible/inducible bacterial expression vector pFLAG-CTC. The host strain XL1-Blue MR is a mutant strain containing neither endogenous β-galactosidase activity nor α-complementation. Induction of gene transcription by isopropyl β-D-thiogalactopyranoside (IPTG) caused the immediate cessation of E. coli growth at 30 to 37° C. However, induction at 20° C. did allow for some limited E. coli growth. When clones containing the pZBG2-1-4 ORF were grown at 20° C. and induced with IPTG, the cells slowly turned blue after 36 hrs of growth in medium containing the β-galactosidase substrate X-GAL (FIG. 10). If not induced with IPTG, no blue color was seen, even after extended growth in media containing X-GAL. As an additional negative control, clones consisting of XL1-Blue MR transformed with the FLAG vector alone never showed any β-galactosidase activity with or without IPTG induction, even after 7 days of growth (FIG. 10). As a positive control for maximal β-galactosidase activity (derived from E. coli β-galactosidase), the cloning vector pGEM was transformed into the host strain DH5α, and the results are also shown in FIG. 10. FIG. 10 shows the detection of β-galactosidase activity from pZBG2-1-4 expression in E. coli. Cells were harvested and extracts were prepared every 12 hours and the A₆₁₅ measured. Cultures were grown with the addition of the chromogenic substrate X-GAL (open symbols) or X-GAL and the transcriptional inducer IPTG (closed symbols) in the medium. The vector used as a positive control for E. coli β-galactosidase activity was pGEM (▪) and the vector used as a negative control and for expression was pFLAG-CTC either without (∘, ●) or containing the pZBG2-1-4 ORF (⋄, ♦).

Example 13 Effects of TBG4 encoded β-galactosidase II on Plant Tissue Texture

To further demonstrate the function of TBG4 encoded β-galactosidase II the following experiments were carried out.

Fruit from tomato plants containing antisense constructs to suppress TBG4 mRNA were up to 40% firmer [compare means of parental line #1 with antisense line #2 in FIGS. 11A-11E(1-4)] than fruit from the parental line. Among the transformants the line with the firmest fruit also had the lowest overall levels of TBG4 mRNA (FIGS. 12A,B). This correlation suggests that a reduction in TBG4 mRNA is associated with increased fruit firmness. Firmer fruit might result in: (1) less shipping damage, meaning (a) less loss due to damage and (b) the ability to harvest at a later stage, resulting in better flavor at market; (2) longer shelf life for both market and consumer; and (3) better quality fruit for the fresh slice market, since fruit cut better at the pink/red stage when firmer.

To determine the function of TBG4 encoded β-galactosidase II, antisense constructs were made using the constitutively expressed 35S CaMV promoter to express TBG4 antisense RNA (FIG. 13). Constructs were moved into tomato using Agrobacterium-mediated transformation. Four tomato cultivars have been transformed: one processing cultivar (cv ‘UC82b’), in order to evaluate the effect of TBG4 suppression on fruit paste quality, and three fresh pick cultivars. Of the fresh pick cultivars, one is a soft fruit large cherry tomato (cv ‘Ailsa Craig’), the second is a soft fruit old breeding line (cv ‘Rutgers’) and the third is a recently developed somewhat firm cultivar, ‘New Rutgers’. Among the lines where TBG4 mRNA is suppressed we expect to observe an increase in firmness and paste viscosity.

Although this project is nearly finished, the complete biochemical and molecular analysis is not. The preliminary results of the analysis of the ‘New Rutgers’ cultivar are presented in FIGS. 11A-E(1-4) and 12A,B. In this example a fresh pick cultivar called ‘New Rutgers’ was used. Plants of the purchased seed were grown and allowed to self and the resulting seed was used as the parental control (line 1). Seven independent transformed plants (lines 2-8) containing TBG4 antisense constructs were grown and allowed to self. Transformation (T-DNA insertion) was confirmed by Southern analysis (data not shown). From each transformed line, five plants were grown along with 10 parental line plants. Fruit were tagged at the breaker stage (first onset of color change) and were harvested at breaker plus 7 days. Data were taken using 15-20 fruit from each line. Each type of texture measurement was done twice for each fruit and fruit were subjected to 4 types of texture measurements using a Stable Micro Systems TA-XT2i texture analyzer. The 4 measurements were: (1) 2-inch flat plate compression to 3 mm (FIG. 11A); (2) 4 mm spherical indenter compression to 3 mm (FIG. 11B); (3) 4 mm cylindrical indenter compression to 3 mm (FIG. 11C); and (4) 4 mm cylindrical indenter puncture to 10 mm (FIG. 11D). A summary of these data is shown in FIGS. 11E(1-4). In FIGS. 11A-E(1-4) line 1 was the parental line and lines 2-8 each represent an independent transformant containing one T-DNA copy of the TBG4 antisense construct. Statistical analysis (Duncan's and Scheffé tests) of the data revealed that fruit from the transformed lines 3, 7 and 8 were not significantly different from the parental line but that transformed lines 2, 4, 5 and 6 were significantly firmer than the parental fruit. Most noteworthy is that fruit from transformed line 2 had a mean firmness 40% greater than that of the parental line (FIGS. 11A-D).

Example 14 Northern Blot Analysis

We are currently investigating any changes in the biochemical composition of fruit where TBG4 mRNA levels have been suppressed. These experiments are designed to show a link between increased fruit firmness and TBG4 mRNA suppression, TBG4 encoded enzyme activity suppression, possible cell wall modification (e.g. increased galactosyl residue content) and a decrease in free galactose levels during fruit ripening.

These experiments are not complete; however, some preliminary Northern blot experiments were done and the data are shown in FIGS. 12A,B. There is no parental or azygous control fruit RNA shown in FIGS. 12A,B because these plants were the last to grow and RNA extractions are just being done now. For a comparison with normal fruit TBG4 mRNA levels, refer to FIG. 5 above. The data from FIG. 5 showed that TBG4 mRNA levels are low at the mature green stage, peak at the turning stage and are reduced at the red stage. All the lines except for 2 and 3 expressed antisense TBG4 mRNA (FIGS. 12A,B). The antisense transcripts appear as two bands, smaller in length than the endogenous mRNA. The two bands probably resulted from (1) the expected transcriptional stop signal provided by the NOS-terminator and (2) a cryptic transcriptional stop signal in the antisense TBG4 cDNA. The most notable result was in line 2, where no TBG4 mRNA was detected at the turning stage. Line 2 also had the firmest red fruit (see FIGS. 11A-D). The absence of detectable TBG4 mRNA probably was the result of cosuppression of both the endogenous and antisense mRNAs. When compared to earlier blots (e.g. FIG. 4), all of the lines appeared to have an overall reduced level of TBG4 mRNA, but it is impossible to assign numbers to this statement without the parental and azygous control RNA on the same Northern blot.

The specification discloses that β-galactosidase II polypeptide is involved in the degradation of cell wall pectin during fruit ripening. The present invention shows the role of β-galactosidases in tomato during fruit ripening and softening, and describes the cloning of a β-galactosidase cDNA clone that codes for a β(1→4)-galactan degrading enzyme which is expressed in ripening tomato fruit tissues.

The present work indicates that pZBG2-1-4 is a cDNA derived from the transcript of the TBG4 gene which codes for β-galactosidase II for the following reasons:

First, the deduced amino acid sequence of the highly conserved amino-terminal portion of the expected mature pZBG2-1-4 translation product matches almost exactly (28 of 30 amino acids) with the amino-terminal sequence of β-galactosidase II as purified by Carey et al. (supra) and designated TOMAA. Importantly, the two amino acids (KY) in the β-galactosidase II sequence (TOMAA), that do not match the pZBG2-1-4 deduced amino acid sequence of the present invention are believed to be incorrect since all plant β-galactosidase sequences in the database and four additional β-galactosidase-related cDNAs that were identified from tomato all match the deduced amino acid sequence of pZBG2-1-4 at these same two amino acid (ST) positions (FIG. 3).

Second, the transcript detected by pZBG2-1-4 is present in normal ripening fruit at the same time that β-galactosidase II activity was detected (FIG. 5; Carey et al., supra). Moreover, little or no transcript was detected in fruit at 45 and 50 dpa from the mutants nor, rin and Nr (FIG. 7). This observation also coincides with the data presented by Carey et al. (supra) that β-galactosidase II activity remained at levels equal to mature green fruit and did not rise in fruit 45-65 dpa from nor or rin plants. Interestingly, Carrington and Pressey (supra) have reported that β-galactosidase II activity was only detected in ‘Rutgers’ fruit after the turning stage of ripeness. The Northern data in the present invention indicates that maximum β-galactosidase II activity occurs only after the turning stage, assuming mRNA levels predict extractable enzyme activity (FIG. 5).

Third, the apparent molecular weight of 77.9 kD and pI of 8.9 for the mature protein predicted from the pZBG2-1-4 sequence are similar to those determined for β-galactosidase II. Pressey (supra) estimated a molecular weight of 62 kD by gel-filtration column chromatography and a pI of 7.8 by isoelectric focusing, while Carey et al. (supra) estimated a molecular weight of 75 kD by SDS-PAGE and a pI of 9.8 by isoelectric focusing.

Fourth, enzyme produced from the pZBG2-1-4 ORF using a heterologous yeast expression system has both β-galactosidase activity and exo-galactanase activity.

1. An isolated nucleic acid molecule comprising a polynucleotide having a nucleotide sequence at least 95% identical to a sequence selected from the group consisting of: (a) a nucleotide sequence encoding the β-galactosidase II polypeptide having the complete amino acid sequence selected from the group consisting of SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15 and SEQ ID NO: 16 and designated TBG1, TBG2, TBG3, TBG4, TBG5, TBG6 and TBG7, respectively, as shown in FIG. 2 or as encoded by the cDNA clone selected from the group consisting of cDNA clones contained in Gen Bank Accession No. AF023847, AF154420, AF154421, AF020390, AF154423, AF154424 and AF154422; (b) a nucleotide sequence encoding the mature β-galactosidase II polypeptide having the amino acid sequence at about positions 24-724 selected from the group consisting of SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15 and SEQ ID NO: 16 and designated TBG1, TBG2, TBG3, TBG4, TBG5, TBG6 and TBG7, respectively, as shown in FIG. 2 or as encoded by the cDNA clone selected from the group consisting of cDNA clones contained in Gen Bank Accession No. AF023847, AF154420, AF154421, AF020390, AF154423, AF154424 and AF154422; and (c) a nucleotide sequence complementary to any of the nucleotide sequences in (a) or (b), above.
 2. The nucleic acid molecule of claim 1 wherein said polynucleotide has the complete nucleotide sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6 and SEQ ID NO:7 as shown in FIG. 2.
 3. The nucleic acid molecule of claim 1 wherein said polynucleotide has the nucleotide sequence in FIG. 2 (SEQ ID NO:4) encoding the β-galactosidase II polypeptide having the amino acid sequence designated TBG4 in FIG. 2.
 4. The nucleic acid molecule of claim 1 wherein said polynucleotide has the nucleotide sequence in FIG. 2 (SEQ ID NO:4) encoding the mature polypeptide having the amino acid sequence from about 24 to about 724 in the amino acid sequence designated TBG4 in FIG. 2.
 5. The nucleic acid molecule of claim 1 wherein said polynucleotide has the complete nucleotide sequence of the cDNA clone contained in Gen Bank Accession No. AF023847.
 6. The nucleic acid molecule of claim 1 wherein said polynucleotide has the complete nucleotide sequence of the cDNA clone contained in Gen Bank Accession No. AF154420.
 7. The nucleic acid molecule of claim 1 wherein said polynucleotide has the complete nucleotide sequence of the cDNA clone contained in Gen Bank Accession No. AF154421.
 8. The nucleic acid molecule of claim 1 wherein said polynucleotide has the complete nucleotide sequence of the cDNA clone contained in Gen Bank Accession No. AF020390.
 9. The nucleic acid molecule of claim 1 wherein said polynucleotide has the complete nucleotide sequence of the cDNA clone contained in Gen Bank Accession No. AF154423.
 10. The nucleic acid molecule of claim 1 wherein said polynucleotide has the complete nucleotide sequence of the cDNA clone contained in Gen Bank Accession No. AF154424.
 11. The nucleic acid molecule of claim 1 wherein said polynucleotide has the complete nucleotide sequence of the cDNA clone contained in Gen Bank Accession No. AF154422.
 12. An isolated nucleic acid molecule comprising a polynucleotide which hybridizes under stringent hybridization conditions to a polynucleotide having a nucleotide sequence identical to a nucleotide sequence in (a), (b), or (c) of claim 1, wherein said polynucleotide which hybridizes does not hybridize under stringent hybridization conditions to a polynucleotide having a nucleotide sequence consisting of only A residues or of only T residues.
 13. An isolated nucleic acid molecule comprising a polynucleotide which encodes the amino acid sequence of an epitope-bearing portion of a β-galactosidase II polypeptide having an amino acid sequence in (a), (b), or (c) of claim 1.
 14. A method for making a recombinant vector comprising inserting an isolated nucleic acid molecule of claim 1 into a vector.
 15. A recombinant vector produced by the method of claim 14.
 16. A method of making a recombinant host cell comprising introducing the recombinant vector of claim 15 into a host cell.
 17. A recombinant host cell produced by the method of claim 16.
 18. A recombinant method for producing β-galactosidase II polypeptide, comprising culturing the recombinant host cell of claim 17 under conditions such that said polypeptide is expressed and recovering said polypeptide.
 19. An isolated β-galactosidase II polypeptide comprising an amino acid sequence at least 95% identical to a sequence selected from the group consisting of: (a) an amino acid sequence at about positions 24-724 selected from the group consisting of SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15 and SEQ ID NO:16 and designated TBG1, TBG2, TBG3, TBG4, TBG5, TBG6 and TBG7, respectively, as shown in FIG. 2; and (b) an amino acid sequence as encoded by the cDNA clone selected from the group consisting of cDNA clones contained in GenBank Accession No. AF023847, AF154420, AF154421, AF020390, AF154423, AF154424 and AF154422.
 20. An isolated polypeptide comprising an epitope-bearing portion of the β-galactosidase II protein.
 21. An isolated antibody that binds specifically to a β-galactosidase II polypeptide of claim 20.
 22. An isolated nucleic acid molecule comprising a polynucleotide having a nucleotide sequence at least 95% identical to a sequence selected from the group consisting of: (a) a nucleotide sequence encoding the β-galactosidase II polypeptide having the complete amino acid sequence selected from the group consisting of SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15 and SEQ ID NO:16 and designated TBG1, TBG2, TBG3, TBG4, TBG5, TBG6 and TBG7, respectively, as shown in FIG. 3 or as encoded by the cDNA clone selected from the group consisting of cDNA clones contained in GenBank Accession No. AF023847, AF154420, AF154421, AF020390, AF154423, AF154424 and AF154422; (b) a nucleotide sequence encoding the mature β-galactosidase II polypeptide having the amino acid sequence at about positions 24-724 selected from the group consisting of SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15 and SEQ ID NO:16 and designated TBG1, TBG2, TBG3, TBG4, TBG5, TBG6 and TBG7, respectively, as shown in FIG. 3 or as encoded by the cDNA clone selected from the group consisting of cDNA clones contained in GenBank Accession No. AF023847, AF154420, AF154421, AF020390, AF154423, AF154424 and AF154422; and (c) a nucleotide sequence complementary to any of the nucleotide sequences in (a) or (b), above.
 23. The nucleic acid molecule of claim 22 wherein said polynucleotide has a complete nucleotide sequence in FIG. 2 selected from the group consisting of SEQ ID NOs: 1-3 and 5-7.
 24. The nucleic acid molecule of claim 22 wherein said polynucleotide has a nucleotide sequence in FIG. 2 selected from the group consisting of SEQ ID NOs: 1-3 and 5-7 encoding the β-galactosidase polypeptide having the complete amino acid sequence designated TBG1-3 and 5-7, respectively.
 25. The nucleic acid molecule of claim 22 wherein said polynucleotide has the nucleotide sequence in FIG. 2 selected from the group consisting of SEQ ID NOs: 1-3 and 5-7 encoding the mature polypeptide having the amino acid sequence designated TBG1-3 and 5-7, respectively.
 26. The nucleic acid molecule of claim 22 wherein said polynucleotide has the complete nucleotide sequence of the cDNA clone contained in a GenBank Accession No. selected from the group consisting of AF023847, AF154420, AF154421, AF020390, AF154423, AF154424 and AF154422.
 27. A method of modifying cell wall metabolism in a plant which comprises transforming said plant with a DNA construct adapted to modify the activity of a β-galactosidase, growing said plant or its descendant, and selecting a plant having modified cell wall characteristics, said construct comprising a transcriptional initiation region operative in plants operably linked to a DNA sequence encoding at least one β-galactosidase.
 28. A method as claimed in claim 27, wherein said DNA sequence is selected from the group consisting of the sequences of nucleic acid molecules claimed in claim 1 or claim 22.
 29. A plant cell transformed with a nucleic acid molecule as claimed in claim 1 or claim 22.
 30. A plant derived from a plant cell as claimed in claim 29.
 31. A plant seed derived from a plant as claimed in claim 30.
 32. A method for modifying β-galactosidase gene expression in a plant comprising transforming said plant with a nucleic acid molecule as claimed in claim 1 or claim 22, growing the transformed plant, and selecting a plant having modified β-galactosidase gene expression when compared with an untransformed plant.
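
Several of the claims above turn on a sequence being "at least 95% identical" to a reference, and on a mature region at "about positions 24-724". The following is only an illustrative sketch of how such a comparison might be computed. It assumes the two sequences have already been aligned by some external tool and that identity is counted column by column over the alignment; this definition, the function names, and the default threshold handling are assumptions made for illustration and are not taken from the claims.

```python
def mature_region(seq, start=24, end=724):
    """Return residues start..end of seq (1-indexed, inclusive).

    The defaults mirror the "about positions 24-724" wording; the exact
    boundaries are an assumption for this sketch.
    """
    return seq[start - 1:end]


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must be pre-aligned strings of equal length. Columns where
    both sequences have a gap are ignored; a gap opposite a residue counts
    as a mismatch. This is one simple convention among several.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, threshold=95.0):
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned 20-residue sequences that differ at a single position give 95.0% under this convention and would just meet the threshold. Other identity conventions (for instance, normalizing by the shorter sequence length) can give different values for the same pair.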